stochastic dominance
Clustering based on Stochastic Dominance with application for risk averters and risk seekers
Li, Hua, Jia, Xue, Kang, Yilin, Wong, Wing-Keung
Stock clustering algorithms play a pivotal role in quantitative finance and the asset management industry, serving as a core mechanism for understanding market complexity and conducting asset preselection. Their intrinsic value lies in enabling investors to identify the underlying structure of the stock market and to group stocks with similar return characteristics or risk profiles. This data-driven market segmentation not only reduces the computational dimensionality involved in portfolio construction but also provides a foundation for formulating differentiated investment strategies. A review of the existing literature shows that domestic and international scholars have made substantial progress in stock clustering. Traditional clustering research predominantly employs classic machine learning algorithms: Xiaojun (2019) and Wu et al. (2022) utilized the K-means algorithm for stock partitioning; Huang et al. (2010) and Lu et al. (2020) explored the sectoral structures of the SSE 50 Index and other markets based on Agglomerative Hierarchical Clustering (AHC) and Spectral Clustering; Korzeniewski (2018) further introduced the Partitioning Around Medoids (PAM) algorithm to construct portfolios with enhanced risk resistance. In recent years, with the advancement of deep learning, Lúcio and Caiado (2022) and Siregar and Yosia (2024) have attempted to incorporate time-series models (such as TGARCH) or specific market features (e.g., Indonesian stock data) into clustering frameworks. However, despite their respective merits in capturing market trends, these methods share a common limitation: they rely almost exclusively on stock-specific information (e.g., price, volatility, or financial metrics) and neglect the heterogeneity of market participants, namely the investors themselves.
In reality, investors are typically categorized into three distinct types based on their risk preferences: risk-averse, risk-seeking, and risk-neutral. Divergent risk attitudes inevitably lead to fundamentally different asset selection logic.
Multivariate Stochastic Dominance via Optimal Transport and Applications to Models Benchmarking
Stochastic dominance is an important concept in probability theory, econometrics and social choice theory for robustly modeling agents' preferences between random outcomes. While many works have been dedicated to the univariate case, little has been done in the multivariate scenario, wherein an agent has to decide between different multivariate outcomes. By exploiting a characterization of multivariate first stochastic dominance in terms of couplings, we introduce a statistic that assesses multivariate almost stochastic dominance under the framework of Optimal Transport with a smooth cost. Further, we introduce an entropic regularization of this statistic, and establish a central limit theorem (CLT) and consistency of the bootstrap procedure for the empirical statistic. Armed with this CLT, we propose a hypothesis testing framework as well as an efficient implementation using the Sinkhorn algorithm. We showcase our method in comparing and benchmarking Large Language Models that are evaluated on multiple metrics. Our multivariate stochastic dominance test allows us to capture the dependencies between the metrics in order to make an informed and statistically significant decision on the relative performance of the models.
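The Sinkhorn algorithm named in the abstract above can be illustrated with a minimal sketch of entropic optimal transport between two small empirical samples. This is not the paper's statistic or implementation: the function name `sinkhorn_plan`, the rescaled squared Euclidean cost, the regularization strength `eps`, and the sample sizes are all assumptions chosen only to show the alternating scaling updates.

```python
import numpy as np

def sinkhorn_plan(cost, a, b, eps=0.5, n_iter=200):
    """Entropic OT plan between histograms a and b (hypothetical helper)."""
    K = np.exp(-cost / eps)              # Gibbs kernel of the cost matrix
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)                # rescale to match column marginals
        u = a / (K @ v)                  # rescale to match row marginals
    return u[:, None] * K * v[None, :]   # coupling diag(u) K diag(v)

# Two synthetic bivariate samples (illustrative data, not from the paper).
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 2))
y = rng.normal(size=(6, 2)) + 0.5

# Stand-in cost: squared Euclidean distance, rescaled to [0, 1] for stability.
cost = ((x[:, None, :] - y[None, :, :]) ** 2).sum(axis=-1)
cost = cost / cost.max()

a = np.full(5, 1 / 5)                    # uniform weights on the x sample
b = np.full(6, 1 / 6)                    # uniform weights on the y sample
plan = sinkhorn_plan(cost, a, b)
transport_cost = float((plan * cost).sum())
```

The returned `plan` is a coupling whose row and column sums approximate `a` and `b`; a dominance-oriented statistic would swap in a cost that penalizes violations of the componentwise order, which is left unspecified here.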
Stochastic Dominance Constrained Optimization with S-shaped Utilities: Poor-Performance-Region Algorithm and Neural Network
We investigate the static portfolio selection problem of S-shaped and non-concave utility maximization under first-order and second-order stochastic dominance (SD) constraints. In many S-shaped utility optimization problems, a liquidation boundary is required to guarantee the existence of a finite concave envelope function. A first-order SD (FSD) constraint can replace this requirement and provide an alternative for risk management. We derive the optimal solution explicitly under a general S-shaped utility function with an FSD constraint. However, the second-order SD (SSD) constrained problem under non-concave utilities is difficult to solve analytically due to the invalidity of Sion's maxmin theorem. To this end, we propose a numerical algorithm to obtain a plausible and sub-optimal solution for general non-concave utilities. The key idea is to detect the poor performance region with respect to the SSD constraints, characterize its structure and modify the distribution on that region to obtain (sub-)optimality. A key financial insight is that the decision maker should follow the SD constraint in the poor performance scenario while following the unconstrained optimal strategy otherwise. We provide numerical experiments to show that our algorithm effectively finds a sub-optimal solution in many cases. Finally, we develop an algorithm-guided piecewise-neural-network framework to learn the solution of the SSD problem, which demonstrates accelerated convergence compared to standard neural network approaches.
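To make the two constraint types in the abstract above concrete, the following is a minimal sketch of checking first-order and second-order stochastic dominance between two equally weighted samples. It is not the paper's poor-performance-region algorithm; the function names `fsd_dominates` and `ssd_dominates` and the tolerance `tol` are illustrative assumptions. FSD is checked as F_X(t) <= F_Y(t) and SSD as E[(t - X)^+] <= E[(t - Y)^+], both evaluated at the pooled sample points, where the empirical functions change.

```python
import numpy as np

def fsd_dominates(x, y, tol=1e-12):
    """True if sample x first-order dominates sample y (empirical CDFs)."""
    x = np.sort(np.asarray(x, dtype=float))
    y = np.sort(np.asarray(y, dtype=float))
    grid = np.union1d(x, y)                       # all jump points of both CDFs
    Fx = np.searchsorted(x, grid, side="right") / x.size
    Fy = np.searchsorted(y, grid, side="right") / y.size
    return bool(np.all(Fx <= Fy + tol))

def ssd_dominates(x, y, tol=1e-12):
    """True if sample x second-order dominates sample y (risk-averse order)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    grid = np.union1d(x, y)                       # kinks of both shortfall curves
    # Expected shortfall below t, E[(t - X)^+], equals the integrated CDF.
    short_x = np.maximum(grid[:, None] - x[None, :], 0.0).mean(axis=1)
    short_y = np.maximum(grid[:, None] - y[None, :], 0.0).mean(axis=1)
    return bool(np.all(short_x <= short_y + tol))

# A sure payoff of 2 versus a fair gamble between 1 and 3 (same mean).
sure = [2.0, 2.0]
gamble = [1.0, 3.0]
sure_fsd = fsd_dominates(sure, gamble)
sure_ssd = ssd_dominates(sure, gamble)
```

In this toy case the sure payoff does not dominate the gamble at first order, because the gamble can pay 3, but it does dominate at second order, which is the ordering shared by all risk-averse expected-utility maximizers.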
Center-Outward q-Dominance: A Sample-Computable Proxy for Strong Stochastic Dominance in Multi-Objective Optimisation
van der Laag, Robin, Wang, Hao, Bäck, Thomas, Fan, Yingjie
Stochastic multi-objective optimization (SMOOP) requires ranking multivariate distributions; yet, most empirical studies perform scalarization, which loses information and is unreliable. Based on the optimal transport theory, we introduce the center-outward q-dominance relation and prove it implies strong first-order stochastic dominance (FSD). Also, we develop an empirical test procedure based on q-dominance, and derive an explicit sample size threshold, $n^*(\delta)$, to control the Type I error. We verify the usefulness of our approach in two scenarios: (1) as a ranking method in hyperparameter tuning; (2) as a selection method in multi-objective optimization algorithms. For the former, we analyze the final stochastic Pareto sets of seven multi-objective hyperparameter tuners on the YAHPO-MO benchmark tasks with q-dominance, which allows us to compare these tuners when the expected hypervolume indicator (HVI, the most common performance metric) of the Pareto sets becomes indistinguishable. For the latter, we replace the mean value-based selection in the NSGA-II algorithm with $q$-dominance, which shows a superior convergence rate on noise-augmented ZDT benchmark problems. These results establish center-outward q-dominance as a principled, tractable foundation for seeking truly stochastically dominant solutions for SMOOPs.